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Abstract. Some temporal networks, most notably citation networks, are naturally represented as directed 
acyclic graphs (DAGs). To detect communities in DAGs, we propose a modularity for DAGs by defining an 
appropriate null model (i.e., randomized network) respecting the order of nodes. We implement a spectral 
method to approximately maximize the proposed modularity measure and test the method on citation 
networks and other DAGs. We find that the attained values of the modularity for DAGs are similar for 
partitions that we obtain by maximizing the proposed modularity (designed for DAGs), the modularity 
for undirected networks and that for general directed networks. In other words, if we neglect the order 
imposed on nodes (and the direction of links) in a given DAG and maximize the conventional modularity 
measure, the obtained partition is close to the optimal one in the sense of the modularity for DAGs. 

PACS. 89.75.Fb Structures and organization in complex systems - 89.75.He Networks and genealogical 
trees - 64.60.aq Networks 


1 Introduction 


Temporality and community structure are two common 
features present in various types of network data. Tem¬ 
porality of networks refers to nodes and links that vary 
over time. For example, a friendship link between a given 
pair of individuals is not always used even if they are close 
friends of each other. The link would be only occasionally 
active as the two individuals meet and then separate. Tem¬ 
porally varying networks are collectively called temporal 
networks 1|. Community structure posits that nodes or 
links in networks can be classified into groups, called com¬ 
munities [2j. Typically, a community is defined such that 
links are dense within a community and relatively sparse 
across different communities. Many networks in different 
domains have community structure. 

The two features can be naturally combined into com¬ 
munity detection in temporal networks, and several al¬ 
gorithms have been proposed to this aim. Examples in¬ 
clude cost minimization when temporal non-smoothness 


is a part of the cost function [3 
temporal smoothness constraints 


optimization under 
methods based on 


the Potts model [6 ( 7 , clique percolation [8], decomposi¬ 


tion of adjacency tensors (9 


for adjacency tensors |10f|12 
minimum description length principle 14 


generalization of modularity 
link clustering [l3], and the 


a The two authors contributed equally to the work. 
b e-mail: naoki . masudaObristol .ac.uk 


Related to temporal networks is the concept of directed 
acyclic graph (DAG). DAGs are directed networks with¬ 
out directed cycles. In DAGs, nodes can be positioned 
within a layer structure such that links only emanate from 
a node in a higher layer to a node in a lower layer (Fig. [l]). 
DAGs have been common as a tool for statistical inference 
for decades 15 16 . Equally importantly, we find various 
instances of DAGs in the real world such as some food 
webs 17 , some dominance hierarchy networks [l8jl9], ci¬ 
tation networks 20-23, family trees 24, and phyloge¬ 
netic networks 25 


Temporal networks can be mapped to DAGs in at least 
two ways. First, citation networks, a type of temporal net¬ 
work, can be naturally mapped to DAGs. A citation net¬ 
work is a directed network in which a node represents an 
article such as a scientific paper, patent, or court decision, 
depending on the network, and a link is directed from the 
citing to the cited nodes 26 . It is temporal in the sense 
that it grows over time due to the addition of new nodes 
and links [l . In principle, the links are directed backwards 
in time because newer nodes can cite older nodes but not 
vice versa, which makes the network a DAG. There may 
be links contradicting the arrow of time in real data sets, 
such as mutual citations [26|, which would violate the def¬ 
inition of a DAG . However, these links are relatively few 
(see section 


5.1 for exemplar numbers). 


Second, a family of temporal networks can be mapped 
to DAGs, as schematically shown in Fig. [2] Consider 
a sequence of adjacency matrices indexed by time, i.e., 
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(A ^) t —where t is discrete time, (A^)^ = 1 if i 
and j are adjacent to each other in the tth snapshot, and 
(A^)ij = 0 otherwise. By definition, (A^)ij = 1 im¬ 
plies that i and j contacted each other at some (continu¬ 
ous) time contained in the time interval [t,t + 1). Such a 
representation of temporal networks as sequences of ma¬ 
trices can be induced by the temporal resolution of the 
recording device or by the aggregation of continuous-time 
temporal network processes over a finite time window to 
create a snapshot |1|. In Fig. [2} each node i is duplicated 
T times, and each duplicated node is labelled (i,f), where 
1 < t < T. Therefore, there are NT nodes in total in 
the representation shown in Fig. [2] where N is the num¬ 
ber of nodes in the original temporal network. We draw a 
link from (i,t) to ( i,t + 1) for every node i (1 < i < N) 
and 1 < t < T. We also draw a fink between two nodes 
in subsequent layers if they are connected in the corre¬ 
sponding time interval. In other words, we draw a link 
from (j , t) to (i, t +1) if ( A W)^ = 1. In this way, a tempo¬ 
ral network given as a sequence of adjacency matrices is 
uniquely mapped to a DAG in which links only span be¬ 
tween two subsequent layers. This type of representation 
and its variants have been used in the analysis of temporal 
networks 27-29 . 


Community detection methods for temporal networks 
mentioned earlier in the present section are designed for 
the latter type of representation. However, to the best of 
our knowledge, community detection methods explicitly 
designed for the former type of temporal networks, or more 
generally DAGs, have not been proposed so far. Commu¬ 
nity structure in the former type of temporal networks 
has been studied using methods such as agglomerative hi¬ 
erarchical clustering 


30 , conventional undirected and di¬ 


31 , and the Infomap method 132 . 


rected modularity 
These detection methods are designed for static networks. 
They neither incorporate the acyclic nature of citation 
networks nor temporal information such as the publica¬ 
tion dates of articles. In these studies, temporal dynam¬ 
ics were analyzed once communities were obtained by the 
static methods. Therefore, if we apply existing methods, 
we have to map a temporal network of the former type to 
a static network by discarding the temporal information 
or to a snapshot representation. Both types of mapping 
seem to be suboptimal given the natural representation of 
the original network as a DAG. 


In the present study, we develop a community detec¬ 
tion method which exploits the intrinsic temporality in the 
former type of temporal networks. Technically, we pro¬ 
pose a community detection method for general DAGs. 
We do so by developing a modularity measure and a max¬ 
imization method for DAGs. A key to this development 
is the observation that the choice of a null model net¬ 
work characterizes communities to be detected. Null mod¬ 
els randomize finks while preserving some properties of 
the original network. Communities obtained by modular¬ 
ity maximization therefore are structures that are statis¬ 
tically surprising relative to the null model. Depending 
on the class of networks, specific null models have been 
proposed. Examples include networks without multilinks 


and selfloops 
works [35l|36 
works 


weighted networks 34 , directed net- 


multi-partite networks 37-39 , spatial net- 


networks with a similarity measure imposed 
networks directly formed by corre- 


and multilayer networks 10 -[12 


between nodes 
lation matrices 
which temporal networks are included as a special case. 
If we use an inappropriate null model to detect commu¬ 
nities in a DAG, we may obtain a suboptimal result. In 
particular, all aforementioned methods do not respect the 
directed layer structure inherent in a DAG. With such 
a method, a modularity value might be large simply be¬ 
cause a DAG is surprising as compared to a non-DAG 
null model. However, we are interested in modular struc¬ 
ture that a given DAG may have whereas reference DAGs 
do not. Therefore, in the present study we propose a mod¬ 
ularity maximization method that uses a null model for 
DAGs. 


2 DAG 


Let us denote a directed network by G = (V, E) with \ V\ = 
N nodes and \E\ = M directed links. We assume that G 
is a DAG. We also assume that G has L layers, that 
node i (1 < i < N) belongs to layer ti (1 < ti < L), 
and that any directed fink from j to i satisfies tj > ti 
(Fig-0- In words, any link emanates from a node in the 
upper layer (i.e., larger layer number) to a node in the 
lower layer (i.e., smaller layer number). No directed link 
exists between nodes in the same layer. A layer may in¬ 
clude multiple nodes. For example, in Fig. [I] layers l— 1, t, 
and t +1 contain three, two, and three nodes, respectively. 
No order is imposed between nodes in the same layer. 

Our definition is motivated by empirical observations 
that it is often more appropriate not to impose an order 
in subsets of nodes. For example, in citation networks, 
articles published on the same date cannot cite each other 
unless the two papers coordinate their citations [21 2.3 26 


In dominance hierarchy networks of animals, it may be 
difficult to rank individual animals (i.e., nodes) if they 
have similar physical strengths and no prior interaction, 
which would be the case in large colonies. 

We can always give such a layering to a given DAG 
although layering is not unique in general 43 . In the re¬ 
mainder of the present article, we simply refer to a DAG 
supplied with a proper layering {t \,... ,In} as a DAG. 


3 Modularity 


Communities detected through modularity maximization 
depend on the null network model that serves as a baseline 
for defining the modularity. In this section, we first briefly 
review modularity measures for undirected and directed 
networks allowing for cycles. Then, we formulate modu¬ 
larity for DAGs by proposing a new null model tailored to 
the case of DAGs. 
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Fig. 1. Schematic of a DAG magnified around layer £. The 
indices l — !,£,£+ 1 represent three subsequent layers from 
lower to upper. The filled circles represent the nodes in layer £. 
The quantities fie and Ar are given by the number of links 
passing through the corresponding vertical lines. 


3.2 Modularity for directed networks 


The modularity for directed networks is defined by 


Q dir = 


1 

M 


E 


i>j=l 



ifcout 


M 


( 2 ) 


where k™ and k° ut are the in-degree of node i and the 
out-degree of node j, respectively [36||45] . In general, A is 
allowed to be asymmetric, and A, 3 = 1 if there is a link 
from node j to i and A, l3 = 0 otherwise. In Q dlr , the num¬ 
ber of intra-community links is compared against the null 
model in which the expected number of links emanating 
from node j to i is equal to P dlr = fc‘ n fc° ut /M. In other 
words, the null model is the directed variant of the con¬ 
figuration model in which each node has the same in- and 
out-degrees as in the original network. 



Fig. 2. Schematic of the DAG representation of a temporal 
network, in which the link configurations between the same set 
of nodes change over time. The lower row shows snapshots of 
the network taken at times in [t— 1, t ) and [t, t+1), respectively. 


3.3 Modularity for DAGs 

Although it is possible to apply Q und and Q dir to detect 
communities in DAGs, neither of them incorporates the 
partial order imposed on nodes in DAGs. Therefore, we 
propose a null model for DAGs by generalizing the random 
DAG model proposed in Refs. 22|[23] (for other null mod¬ 


els of DAGs, see Ref. 46 


). In the random DAG model, 
links are randomized while preserving the in- and out- 
degree of each node and the order of nodes as specified 
by Li. The difference between the present definition and 
that in Refs. 22 23 is that the former allows a layer to 


contain multiple nodes, whereas each node constitutes a 
single layer in the latter case. 

Following the derivation in Ref. 


22 23 , we calculate 


the expected number of links P dag from node j to node i. 
We start by defining 


3.1 Modularity for undirected networks 

The modularity for undirected networks is defined by 


E c- E fc * ut , 

E c- E fcr = w-E fc i ut , 

i; l<£i<£ i'A=£ 


( 3 ) 

( 4 ) 


1 N 

Qund - V' 

^ 2 M ^ 

U.?=l 


An, 


hkj \ 

2 M J 


(1) 


tJ is the (L j) element of the adjacency matrix, 


where A. ^ 

and ki = YhjLi is the degree of node i 2 44 . The ad¬ 
jacency matrix is defined by A, 3 = 1 if nodes i and j are 
adjacent by a link and A. i3 = 0 otherwise. It is symmetric 
(i.e., Aij = A 3 i) for undirected networks. Function S is the 
Kronecker delta, and c; represents the index of the com¬ 
munity to which node i belongs. In Eq. the frequency 
of links within communities, corresponding to AijS Ci:Cj in 
Eq. is compared against that for the null model, cor¬ 
responding to kikjS Ci>c . /2M. Under the null model, the 
expected number of links between nodes i and j is equal 
to Pj“ nd = kik 3 /2M. In other words, the null model is 
the configuration model in which each node has the same 
degree as in the original network. 


where 1 < i < L. In words, fie is equal to the number 
of links from nodes in layers {£,£ + 1,..., Lj to nodes in 
layers {1, 2 ,...,£— 1}, and A £ is equal to the number of 
links from nodes in layers {£ + 1, £ + 2,..., L} to nodes in 
layers {1,2, 1}. Graphically, X e and [if are equal 

to the number of links crossing a section at £ (dotted line 
in Fig. [lj and one before layer £ (dashed line in Fig. [I]), 
respectively. In the example shown in Fig. [l] Xf = 1 
and /if = 4. 

The in-degree of node i, kf 1 , can be visualized as k\ n 
in-stubs attached to node i. where a stub is a half-link. 
Likewise for the out-degree. Joining an in-stub and an out- 
stub yields a link. We calculate the expected number for 
an in-stub attached to node i to connect to node j, which 
we denote by ptj. Let us first consider the case £ 3 = £i + 1. 
In this case, we obtain 


Pij = 


out 


Wi+1 


( 5 ) 
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because j possesses k° ut out-stubs and there are 
out-stubs from which an in-stub attached to node i can 
choose. The probability that an in-stub of node i does not 
connect to a node in layer Pi + 1 is given by 


1 - Pv = 1 “ 

r,lj=ti +1 


E LOUt 

j;U=L +i K j 
Vu- i-i 


At, 


+i 


Mi+l 


( 6 ) 


3 

Mi 


Using Eqs. ([7]) and Q, we obtain 


- Y\t=t . i +1 A ij 

nt 


Pij 


= k° 


( 8 ) 


(9) 


i+i 


Mi 


P dag = k in k m 


-X[i=ii +1 Atj 

rifcti+i Mj 


( 10 ) 




dag ' 


( 11 ) 


where P{( ag is given by 

'o 


pdag __ 


kin kout 


ru< i+ i At 
n/=t i+ i m 


A 0)> 

(Pi < ^i)- 


( 12 ) 


used to maximize Q dlr [2 36 47^ Here we adapt the spec¬ 


tral method 48 49 to the case of Q dag . 


4.1 Spectral method for undirected networks 

Before describing the spectral method for Q dag , we review 


the method for Q 


und 


48 


49 


The spectral method real- 


Now we assume in general that node j belongs to any 
layer Pj > Pi. Then, an in-stub of node i can connect to 
node j only if it does not connect to a node in layers Pi < 
i < Ij. By iteratively applying the argument leading to 
Eq. (p}, we find that the probability that an in-stub of 
node i does not to connect to any node in layers Pi < P < Pj 
is given by 

U- 1 X 

n —■ < j > 

p=Pi +1 

Provided that the in-stub of node i does not connect to a 
node in layers Pi < P < Pj, the probability that it connects 
to node j is given by 

k ont 


izes community detection by iteratively bipartitioning a 
tentative community. 

We initially partition the entire network into two 
groups of nodes. To this end, we use the fact that Q und is 
written in matrix form as 


nuna _ i j~> 

Q =4 M' B 


T ound, 


(13) 


where T denotes the transposition, s = (si ••• Sjv) T , 
Si £ {—1,1}, and B und is an N x TV symmetric matrix 
whose elements are given by B™ d = Ajj — (kikj/2M). 
Node i is classified to the first and second groups if.s, = — 1 
and Si = 1, respectively. 

Finding the vector s that maximizes Q und is an NP- 
complete problem 50 . A commonly applied heuristics is 


to relax elements of s to take continuous values and im¬ 
pose the normalization constraint s T s = N. Then, Q und 
is maximized when s is the eigenvector associated with 
the largest eigenvalue of B und , which we denote by u. We 
carry out bipartition by setting s 7 - = sgn(iq) (1 < i < N). 

The spectral method takes advantage of the fact that 
we can rapidly calculate u using the power method. In 
fact, the power method requires calculation of the product 
B und a; for changing x. Although this computation may 
look computationally costly because B und is a dense ma¬ 
trix, B und a: can be expressed as 


Because node i has kf 1 in-stubs, the expected number 
of links from j to i is given by 


B und x = Ax- 


k(k T x) 
2 M 


(14) 


Using P? ag , we define the modularity for DAGs, denoted 
by Q das , by 


where k = (k\, ■ ■ •, fcjv) 1 48 49 . The inner prod¬ 

uct k T x is calculated in time O(N), while the calcula¬ 
tion of Ax requires time 0(N + M). For sparse graphs, 
M = O(N) such that B und a; can be calculated in O(N) 
time, accelerating the power method. 

Once the network is partitioned into two tentative 
communities, we repeat selecting and bipartitioning one 
tentative community until Q und stops increasing. To de¬ 
cide whether to approve a proposed bipartition of a ten¬ 
tative community C into two groups of nodes C\ and C 2 , 
one needs to calculate the change in modularity by the 
partitioning, which is given by 


AQ und = 


1 

2 M 


T (Bif + B™ 1 )- £ B"" d 

ieCi,tec 2 i,jec 


4 Spectral methods for modularity 
maximization 

Because maximization of modularity is a combinationally 
hard problem, many approximate algorithms have been 
proposed for maximizing Q und | 2|, whose variants are also 


In matrix form, Z\Q und is written as 


/\/Ound _ _~T nund- 

V 4 M S 


(15) 


(16) 


where s is a column vector with length \C\, and _B und is 
a \C\ x \C\ symmetric matrix whose elements are defined 

. When C is the 


by B and = B™ d - E 


td und 
k£C D ik 


48 
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entire network, Z\Q und , B und , and s are equal to Q und , 
B und , and s , respectively. Under the constraint § T s = \C\, 
Eq. (16) is maximized if s is the eigenvector associated 


und 


= ^s T [B dag + (B dag ) T ] 


(17) 


[. B dag + (B dag ) T ] x similar to Eq. (14), we write 


pdag _ ^in^out 


*7 


where 


'0 


m-(j) = < nt 1 


(U ^ £j )j 


i+1 


A<> 


n/U+i w 


(U < (.j). 


(19) 


As discussed in detail in Ref. 23 , Eq. (19) when U < £j 
is identical to 


(Af i+l> • • • ; A^-i ^ 0 ), 
(otherwise), 

( 20 ) 




P dag _ 


/( U ow (*),^ p ( i ))'" 1 




( 21 ) 


It should be noted that P dag = 0 otherwise. Because 
A^ i+ i,..., X( i ^ 0 if and only if £ low (i) = £ low (j) and 


with the largest eigenvalue of B und , which we denote by u. 
Finally, we bipartition C by setting s, = sgn({q) (i £ C). 


4.2 Spectral method for directed networks 

The spectral method for directed networks was imple¬ 
mented in Ref. [36]. Here we briefly recapitulate their 
arguments. We can rewrite the modularity for directed 
network as Q dlr = (l/4M)s T [B dlr + (B dlr ) T ] s with 


(H dl % = Aij - (kf'k° ut )/M. Because B dir + (B dir ) 1 is 
a symmetric matrix, we can maximize Q dlr by replacing 
L? und in Eq. (14) by H dlr + (B dlT ) r and applying the same 
spectral method as that for Q und . 

To calculate the change in modularity Z\Q dlr caused 
by bipartitioning, we replace B und by H dlr + (H dlr ) T in 
Eqs. (fl5|) and (16). Then, we maximize AQ dlr by following 


the same steps as for maximizing AQ 


4.3 Spectral method for DAGs 

The application of the spectral method to Q dag is straight¬ 
forward. For Q dag , a single step in the power method can 
be accelerated as in the case of Q und and Q dir as follows. 
First, we write 


where, £ low (i) is the largest value of £(< U) such that 
Ag = 0, and £ up (j) is the smallest value of £{> £j) such 
that At = 0. If there is no £(< U) or £(> lj) such that 
A( = 0, we obtain £ low (i) = 1 and £ up (j) = L , respectively. 
In words, f(£i,£j) > 0 if and only if for every layer between 
£i and £j, we find at least one link penetrating the layer. 
Using Eq. (20), we decompose P dag in the case U < £j 
and A^+i,...,A £ i ^ 0 as 


£ up (i) = £ up (j), we can rewrite Eq. (21) as 

pdag _ f{£t,£ P W) fin r /alow/ -\ » ^ r,out foo\ 

V ~ f(£ l ™(i ),£*(*))* [3)y j)l j ■ [ ’ 

For each layer £, we define K ln (i) = {k'i{£) ■ ■ ■ k 1 £(£)) t 
and K out (£) = («? ut {£) • • • «w*(£)) T by 


0 


(ii ¥= t), 


«“(*) = { 7 in (p = ( 23 ) 

J(£ l ™(i),£" p (i)) * l) ’ 


and 


k™\£) = 


{£ < & w {i) or £i<£), 


f(£ low (i),£i)k^ (£ low (i) <£< £i), 


(24) 


respectively. By combining Eqs. (221, (23) and (24), we 
obtain 


}dag 


= c(u)«r‘(^) = E c(^K ut w- ( 25 ) 


i=\ 


Finally, we decompose B dag x as 


B dag x = Ax-J2 « in (^) (k ou \£) t x) . 


(26) 


where B dag = A,j — P dag . To find a decomposition of 

(18) 


Therefore, once {K m (^),/« out (£)} 1< have been calcu¬ 
lated beforehand, the computation of B dag x takes 0(LN) 
time, and likewise for (P dag ) T a;. 

To implement repeated bipartitioning, we maximize 
the change in modularity, AQ dag , which a biparition of a 
community causes. We calculate AQ dag by replacing P und 
by P dag + (P dag ) T in Eqs. ^ and @.'Then, we follow 
the same steps as for maximizing Z\Q und . 


5 Results 

In this section, we test the proposed spectral method 
for maximizing Q dag for several DAG data. We test the 
method against the spectral method maximizing Q und and 
Q dlr and the so-called Louvain method, which is a non¬ 
spectral algorithm that heuristically maximizes Q und 51 


We use the implementation of the Louvain method in the 
igraph package of R 52 . In our implementation of the 


spectral method, we add a fine tuning step after every 
bipartitioning in a similar fashion to Refs. 48 49] (Ap- 


pendix|A]). We do not apply this fine tuning to the Louvain 
method because this fine tuning is specialized to the spec¬ 
tral method. We also apply a postprocessing procedure 
after the spectral method or the Louvain method has ter¬ 
minated (Appendix[B]). To apply the spectral method and 
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Table 1. Properties of the networks analyzed in section|5| The 
number of nodes, that of links, and that of layers are denoted 
by N, M, and L, respectively. These values refer to those of 
the largest weakly connected component of each network. 


Data 

N 

M 

L 

HepPh 

30,337 

344,578 

3,683 

HepTh 

27,377 

351,033 

4,139 

E. coli 

328 

456 

5 

Cl 

20 

29 

4 

C2 

32 

55 

7 

C3 

48 

134 

8 

C4 

70 

158 

7 

C6 

64 

137 

8 


the Louvain method to maximize Q und , and also to calcu¬ 
late Q und for the final partition obtained by various meth¬ 
ods, we ignore link direction, or equivalently, use A + A T 
as the adjacency matrix of the undirected network. 

To quantify similarity between two partitions of the 
same network, we calculate the Jaccard index for each 
pair of partitions [2]. The Jaccard index is defined by 


J = 


«i 

a\ + a 0 ’ 


(27) 


where ao and ai are the number of node pairs that are 
classified to the same community in only one partition 
and in both partitions, respectively. If J = 1, the two 
partitions are exactly the same. If J = 0, they completely 
disagree. 


Table 2. Modularity values for the citation networks. The 
results obtained by maximization of Q und , Q dlr , and Q dag us¬ 
ing the spectral method are shown in the rows labelled “s- 
und”, “s-dir”, and “s-dag”, respectively. The results obtained 
by maximization of Q und using the Louvain method are shown 
in the rows labeled “L-und”. N c denotes the number of com¬ 
munities. 


Data 

Method 

N e 

Q und 

Q dir 

Q dag 

HepPh 

s-und 

19 

0.7292 

0.7292 

0.7259 


s-dir 

18 

0.7288 

0.7288 

0.7257 


s-dag 

16 

0.7300 

0.7300 

0.7263 


L-und 

19 

0.7331 

0.7332 

0.7292 

HepTh 

s-und 

21 

0.6569 

0.6571 

0.6332 


s-dir 

30 

0.6447 

0.6450 

0.6207 


s-dag 

20 

0.6452 

0.6453 

0.6320 


L-und 

28 

0.6558 

0.6561 

0.6266 


Table 3. Jaccard indices between the partitions obtained for 
the citation networks. 


Data 

Method 

s-und 

s-dir 

s-dag 

L-und 

HepPh 

s-und 

1 

0.6383 

0.4116 

0.4287 


s-dir 

0.6383 

1 

0.4298 

0.3905 


s-dag 

0.4116 

0.4298 

1 

0.4557 


L-und 

0.4287 

0.3905 

0.4557 

1 

HepTh 

s-und 

1 

0.3154 

0.2918 

0.2859 


s-dir 

0.3154 

1 

0.3417 

0.3262 


s-dag 

0.2918 

0.3417 

1 

0.2804 


L-und 

0.2859 

0.3262 

0.2804 

1 


5.1 Citation networks 


As an example of temporal networks represented by 
DAGs, we study two citation networks of articles posted 
on the e-print archive arXiv.org. We use data on the High 
Energy Physics Phenomenology (HepPh) and High En¬ 
ergy Physics Theory (HepTh) sections of arXiv, which ex¬ 
haustively cover the citations between January 1993 and 


April 2003 53-55 


The HepPh and HepTh data contain 34,546 and 27,770 
nodes, and 421,578 and 352,807 links, respectively. There 
are four mutually exclusive types of nodes and links ex¬ 
cluded from the following analysis. First, we discard 44 
and 39 self-loops in the HepPh and HepTh networks, re¬ 
spectively. Second, we discard 2,622 and 1,475 links in 
HepPh and HepTh networks, respectively, which contra¬ 
dict the arrow of time (i.e., articles citing others newer 
than themselves). Third, in the publicly available data 
of the HepPh network, the information about the publi¬ 
cation date is not provided for the articles posted after 
11 March 2002. Because we cannot assign a date-based 
layer ^ to these articles, we remove 3,985 such articles. 
Fourth, we remove 74,246 links in the HepPh network 
that are incident to at least one node whose date infor¬ 
mation is missing. Although we have removed a fraction 
74,276/421,578 = 0.176 of the links according to the 


Table 4. Modularity values for the E. coli transcriptional 
regulation network. See the caption of Table [2] for legends. 


Data 

Method 

N c 

Q und 

Q dir 

Q dag 

E. coli 

s-und 

13 

0.7366 

0.7373 

0.6998 


s-dir 

13 

0.7325 

0.7332 

0.6938 


s-dag 

13 

0.7136 

0.7142 

0.6988 


L-und 

13 

0.7357 

0.7364 

0.6972 


Table 5. Jaccard indices between the partitions obtained for 
the E. coli transcriptional regulation network. 


Data 

Method 

s-und 

s-dir 

s-dag 

L-und 

E. coli 

s-und 

1 

0.7885 

0.6747 

0.8875 


s-dir 

0.7885 

1 

0.5998 

0.7927 


s-dag 

0.6747 

0.5998 

1 

0.6699 


L-und 

0.8875 

0.7927 

0.6699 

1 


fourth criterion, all removed links are those incident to 
the newest nodes. Therefore, the network restricted to the 
dates before 11 March 2002, which we effectively consider 
in the following, is not affected by the removal of these 
links. 

After removing self-loops, links contradicting the ar¬ 
row of time, nodes without the information about the 
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publication date, and the links incident to these nodes, 
we obtain 30,561 nodes and 344,666 links in the HepPh 
network and 27,770 nodes and 351,293 links in the HepTh 
network. We use the largest weakly connected component 
of each network. Layers are defined by the date of publi¬ 
cation. It should be noted that some layers had no node 
because no article was published on that date. This fact 
does not change the following results because Q dag and its 
optimization procedure described in the previous sections 
are not affected by empty layers. Some basic quantities of 
the largest weakly connected component of the two net¬ 
works are summarized in Table [lj 

Modularity values obtained after the maximization 
procedure are summarized for the two networks in Ta¬ 
ble [2j A row corresponds to a combination of the maxi¬ 
mized modularity measure (i.e., Q und , Q dlr , or Q dag ) and 
the maximization method (i.e., spectral or Louvain). For 
example, s-und represents the spectral method for maxi¬ 
mizing Q und . The columns correspond to the three mod¬ 
ularity values measured at the end of the maximization 
procedure, given a modularity maximization method cor¬ 
responding to a row. 

The modularity values shown in Table [2] seem to be 
large. It turns out that, in both networks, a partitioning 
method designed for Q und (i.e., L-und for the HepPh net¬ 
work and s-und for the HepTh network) has yielded the 
largest value for not only Q und but also Q dlr and Q dag . 
Therefore, for these networks, it is better to use a method 
designed for maximizing Q und to (also) maximize Q dag 
than a method designed for maximizing Q dag . Table[2]also 
indicates that the spectral method designed for maximiz¬ 
ing Q dag (s-dag) does not fall far behind s-und or L-und 
in terms of the obtained Q dag value. In fact, regardless of 
the method, we obtained similar values of Q und , Q dlr , and 



The similarity between each pair of the four optimized 
partitions for each network (Table [2]) is shown in Table [3j 
where similarity is measured in terms of the Jaccard in¬ 
dex. We find that the Jaccard indices are not very large, 
in particular for the HepTh network, although the differ¬ 
ent partitioning methods have yielded close modularity 
values, as shown in Table [2j Nevertheless, the Jaccard in¬ 
dex may not be very large even between partitions with 
close modularity values obtained from the same stochastic 
partitioning algorithm applied to the same network. For 
example, the Jaccard index values between 0.5 and 0.9 re¬ 
ported in Ref. 56 are larger than but comparable with 


the present values. It has also been reported that parti¬ 
tioning results obtained from the same network yielding 
close modularity values can be fairly different [2 57 . 


5.2 Transcriptional regulation network 


We study the compartmentalization of the transcrip¬ 
tional regulation network of the bacteria Escherichia 
coli ( E. coli ) using publicly available data 58 59 . Nodes 
of this network are operons. Links are directed from the 
operon that encodes a transcription factor to an operon 
regulated by the transcription factor. The network, which 


Table 6. Modularity values for five dominance networks. The 
results obtained with the optimal modularity maximization 
method are shown in rows labeled “o-und”, “o-dir”, and “o- 
dag”. In the table, indicates that the optimal method has 
not terminated. See the caption of Table [2] for the other leg¬ 
ends. 


Data 

Method 

N c 

Q und 

Q dir 

Q dag 

Cl 

s-und 

4 

0.2990 

0.3258 

0.3393 


s-dir 

3 

0.2943 

0.3317 

0.3421 


s-dag 

3 

0.2812 

0.3282 

0.3468 


L-und 

4 

0.3288 

0.3401 

0.3633 


o-und 

4 

0.3288 

0.3401 

0.3633 


o-dir 

4 

0.3157 

0.3484 

0.3662 


o-dag 

4 

0.3157 

0.3484 

0.3662 

C2 

s-und 

4 

0.3522 

0.3593 

0.3729 


s-dir 

4 

0.3607 

0.3702 

0.3951 


s-dag 

4 

0.3607 

0.3702 

0.3951 


L-und 

4 

0.3607 

0.3702 

0.3951 


o-und 

4 

0.3607 

0.3702 

0.3951 


o-dir 

5 

0.3526 

0.3716 

0.3890 


o-dag 

4 

0.3607 

0.3702 

0.3951 

C3 

s-und 

5 

0.2339 

0.2372 

0.2503 


s-dir 

5 

0.2311 

0.2594 

0.2643 


s-dag 

6 

0.2416 

0.2531 

0.2654 


L-und 

3 

0.2378 

0.2421 

0.2572 


o-und 

- 

- 

- 

- 


o-dir 

5 

0.2311 

0.2594 

0.2643 


o-dag 

6 

0.2416 

0.2531 

0.2654 

C4 

s-und 

5 

0.2690 

0.2782 

0.2882 


s-dir 

6 

0.2612 

0.2788 

0.2899 


s-dag 

5 

0.2604 

0.2767 

0.2914 


L-und 

5 

0.2626 

0.2656 

0.2766 


o-und 

- 

- 

- 

- 


o-dir 

- 

- 

- 

- 


o-dag 

- 

- 

- 

- 

C6 

s-und 

6 

0.3374 

0.3396 

0.3512 


s-dir 

5 

0.3308 

0.3422 

0.3412 


s-dag 

6 

0.3348 

0.3428 

0.3600 


L-und 

5 

0.3467 

0.3504 

0.3493 


o-und 

5 

0.3487 

0.3529 

0.3521 


o-dir 

5 

0.3487 

0.3529 

0.3521 


o-dag 

6 

0.3356 

0.3440 

0.3606 


is a DAG, contains 423 nodes and 519 directed links with¬ 
out self loops. We extracted the largest weakly connected 
component, which consisted of 328 nodes and 456 links. 


Layers were determined by a leaf-removal algorithm 43 


In this algorithm, the nodes with out-degree zero are clas¬ 
sified to the lowest layer. Then, these nodes together with 
links connecting to these nodes are removed. The nodes 
with out-degree zero in the remaining network are classi¬ 
fied to next lowest layer. This procedure is iterated until 
all nodes are exhausted. The basic quantities of this net¬ 
work are summarized in Table [T] 

The modularity values obtained from the four modu¬ 
larity maximization algorithms are shown in Table [4] Al¬ 
gorithm s-und produces the largest Q und , Q dlr , and Q dag , 
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Table 7. Jaccard indices between the partitions obtained for the dominance networks. 


Data 

Method 

s-und 

s-dir 

s-dag 

L-und 

o-und 

o-dir 

o-dag 

Cl 

s-und 

1 

0.7288 

0.6452 

0.5965 

0.5965 

0.5439 

0.5439 


s-dir 

0.7288 

1 

0.5946 

0.4658 

0.4658 

0.4857 

0.4857 


s-dag 

0.6452 

0.5946 

1 

0.5970 

0.5970 

0.7049 

0.7049 


L-und 

0.5965 

0.4658 

0.5970 

1 

1 

0.8235 

0.8235 


o-und 

0.5965 

0.4658 

0.5970 

1 

1 

0.8235 

0.8235 


o-dir 

0.5439 

0.4857 

0.7049 

0.8235 

0.8235 

1 

1 


o-dag 

0.5439 

0.4857 

0.7049 

0.8235 

0.8235 

1 

1 

C2 

s-und 

1 

0.7083 

0.7083 

0.7083 

0.7083 

0.5753 

0.7083 


s-dir 

0.7083 

1 

1 

1 

1 

0.7724 

1 


s-dag 

0.7083 

1 

1 

1 

1 

0.7724 

1 


L-und 

0.7083 

1 

1 

1 

1 

0.7724 

1 


o-und 

0.7083 

1 

1 

1 

1 

0.7724 

1 


o-dir 

0.5753 

0.7724 

0.7724 

0.7724 

0.7724 

1 

0.7724 


o-dag 

0.7083 

1 

1 

1 

1 

0.7724 

1 

C3 

s-und 

1 

0.2727 

0.2755 

0.3028 

- 

0.2727 

0.2755 


s-dir 

0.2727 

1 

0.6687 

0.3275 

- 

1 

0.6687 


s-dag 

0.2755 

0.6687 

1 

0.3548 

- 

0.6687 

1 


L-und 

0.3028 

0.3275 

0.3548 

1 

- 

0.3275 

0.3548 


o-und 

- 

- 

- 

- 

1 

- 

- 


o-dir 

0.2727 

1 

0.6687 

0.3275 

- 

1 

0.6687 


o-dag 

0.2755 

0.6687 

1 

0.3548 

- 

0.6687 

1 

C4 

s-und 

1 

0.5726 

0.5928 

0.4807 

- 

- 

- 


s-dir 

0.5726 

1 

0.8047 

0.6039 

- 

- 

- 


s-dag 

0.5928 

0.8047 

1 

0.5827 

- 

- 

- 


L-und 

0.4807 

0.6039 

0.5827 

1 

- 

- 

- 


o-und 

- 

- 

- 

- 

1 

- 

- 


o-dir 

- 

- 

- 

- 

- 

1 

- 


o-dag 

- 

- 

- 

- 

- 

- 

1 

C6 

s-und 

1 

0.5653 

0.4153 

0.5841 

0.5697 

0.5697 

0.3254 


s-dir 

0.5653 

1 

0.3055 

0.5779 

0.6641 

0.6641 

0.2852 


s-dag 

0.4153 

0.3055 

1 

0.3699 

0.3121 

0.3121 

0.7118 


L-und 

0.5841 

0.5779 

0.3699 

1 

0.7588 

0.7588 

0.2959 


o-und 

0.5697 

0.6641 

0.3121 

0.7588 

1 

1 

0.3448 


o-dir 

0.5697 

0.6641 

0.3121 

0.7588 

1 

1 

0.3448 


o-dag 

0.3254 

0.2852 

0.7118 

0.2959 

0.3448 

0.3448 

1 


similarly to the results for the HepTh network (Table [2]). 
The Jaccard indices are shown in Table [5] They are much 
larger than those for the citation networks (Table |3|, sug¬ 
gesting that the partitions obtained by the different meth¬ 
ods are relatively similar to each other for the present net¬ 
work. 


5.3 Dominance networks 


As a final example, we examine DAGs induced by dom¬ 
inance hierarchies in ant colonies using the data in 
Ref. 18 . The data set contains aggression-based hierarchy 


among workers in six ant colonies of species Diacamma sp. 
Nodes represent individual female ant workers and a link 
is drawn from the attacking to the attacked ants, where an 
attack is defined as a bite and jerk. Layers are determined 
by the leaf-removal algorithm used for the transcriptional 
regulation network. The data set contains six colonies, five 


of whose largest weakly connected components are DAGs 
that we use in the following analysis. Basic quantities of 
the largest weakly connected component of the five DAGs 
are summarized in Table [l] 


The present DAGs are smaller than those analyzed 
in the previous sections. Therefore, for some networks, 
we were able to calculate the partition that exactly 
maximized the modularity using integer linear program¬ 
ming 50 . In this method, originally applied to Q und , vari¬ 


ables Xij £ {0,1} are defined for every pair of nodes i 
and j. where X l3 = 1 {X^ = 0) indicates that i and j 
are classified to the same community (different communi¬ 
ties). Then, we rewrite the modularity in terms of Xjj by 
replacing S CiCj by X^ in Eqs. Q. ([2]), and for Q und , 
Q dlr , and Q dag , respectively. The variables need to satisfy 
X-i j — 1, X ij — Xj . and X^j T Xj f, 2 A ? /, A 1 for all 
1 < h,i,j < N, which defines an integer linear program 
for exact maximization of modularity. 
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The results for modularity maximization and the Jac- 
card indices for the obtained partitions are shown in Ta¬ 
bles [6] and [ 7 ] respectively. Table [g] indicates that the ob¬ 
tained modularity values vary across colonies (i.e., net¬ 
works) and do not solely depend on the size of the network. 
Therefore, different colonies may have different degrees of 
community structure. The table also indicates that s-dag 
realizes the largest Q dag values in all colonies except Cl, 
among the four heuristic methods (i.e., s-und, s-dir, s- 
dag, and L und). Algorithm s-dag realizes the optimal 
Q dag value obtained by o-dag for C2 and C3. The Q dag 
value obtained by s-dag is also close to that obtained by o- 
dag for C6, whereas this is not the case for Cl. Similarity 
among the partitions obtained by the different algorithms 
depends on colonies without clear patterns (Tabic [ 7 ]). 


6 Conclusions 

We have proposed a modularity measure for DAGs and 
a spectral algorithm to maximize it by extending the 
spectral algorithm developed for undirected networks. We 
found that the obtained modularity values are rather in¬ 
dependent of whether we used the proposed algorithm 
or other algorithms known for less restricted networks 
such as undirected or directed networks allowing for cy¬ 
cles. Therefore, up to our numerical efforts, simply apply¬ 
ing modularity maximization methods for undirected net¬ 
works to DAGs may be practically innocuous. We stress 
that we have reached this conclusion by actually develop¬ 
ing a modularity measure for DAGs and testing it against 
previous methods using several data sets. 

The spectral method was presented as an example 
heuristic for maximizing the proposed modularity mea¬ 
sure. The proposed modularity measure can be also maxi¬ 
mized by other approximate methods, which may surpass 
the performance of the spectral method. 

We acknowledge Steve Gregory and Kohei Tamura for 
discussion and careful reading of the manuscript. L.S. ac¬ 
knowledges the support provided through DAAD. N.M. 
acknowledges the support provided through JST, CREST, 
and JST, ERATO, Kawarabayashi Large Graph Project. 

T.T. and N.M. designed the research. L.S., T.T., and 
N.M. developed the theory. L.S. and T.T. analyzed the 
data. L.S., T.T., and N.M. discussed the results and wrote 
the paper. 


A Fine tuning after bipartitioning 

In the spectral methods for maximizing Q und , <5 dlr , and 
Q dag , we carried out the following fine tuning procedure 
every time after a community was bipartitioned. Sup¬ 
pose that we have bipartitioned a community C into two 
communities C\ and Ci. For every node in C, we ten¬ 
tatively moved the node to the opposite community and 
calculated the change in the targeted modularity value. 
Then, we adopted the attempted move that maximized 
the modularity under the condition that the modularity 


increased by the move. We repeated this procedure under 
the constraint that each node was moved at most once, 
until no further increase in modularity was possible. This 
fine tuning procedure slightly modifies that proposed for 
Q und |48}|49| . We modified the original procedure because 
it involved repetition of a procedure similar to the afore¬ 
mentioned one and did not terminate on our data when 
we attempted to maximize Q dag . 


B Postprocessing 


After the completion of modularity maximization with 
the spectral or Louvain method, we carried out the fol¬ 
lowing postprocessing algorithm to enhance the targeted 
modularity value. Our postprocessing algorithm resem¬ 
bles that employed in other modularity maximization al¬ 


gorithms 51 


60 


Step 1: For a given node i (1 < i < TV), we tentatively 
moved it to the community to which each adjacent node 
of i belonged. The move attempt that increased the mod¬ 
ularity by the largest amount was adopted. We scanned 
the N nodes in random order. We neglected the direction 
of link when judging adjacency in the case of directed net¬ 
works including DAGs. 

Step 2: First, we selected the smallest community in 
terms of the number of nodes, called the seed commu¬ 
nity. Second, we tentatively merged the seed community 
with each of the remaining communities to which the seed 
community is directly connected by a link regardless of 
the direction. Third, we measured the modularity value. 
We adopted the merge attempt that yielded the largest 
modularity but only when the modularity increased by 
the merge. Then, among the community that had never 
been selected as seed community, we selected the smallest 
community as seed community and attempted merger. We 
repeated the same procedure until no further increase in 
the modularity occurred. 

We applied step 1 and then step 2. We repeated the 
combination of the two steps ten times. We also verified 
that swapping the order of the two steps had little impacts 
on the final modularity value. 
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